System and method for determining drilling activity

ABSTRACT

A method and system for interpreting oilfield process data, including drilling rig data and/or the like, is described, the method and system including use of a knowledge representation containing a representation of uncertainty in the oilfield process operations.

Embodiments of the present invention relate to interpreting data, including but not limited to interpreting data from oilfield applications—which data may include but is not limited to drilling data, production data, well data, completions data, drill string data, wellbore data, logging data and/or the like—using a knowledge representation that contains representation of uncertainties.

BACKGROUND

The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.

In oilfield applications, the drilling process can be impeded by a wide variety of problems. Accurate measurements of downhole conditions, rock properties and surface equipment allow many drilling risks to be minimized and may also be used for detecting when a problem has occurred. At present, most problem detection is the result of human vigilance, but detection probability is often degraded by fatigue, high workload or lack of experience.

Merely by way of example, in oilfield applications, some limited techniques have been used for detecting the occurrence of one of two possible rig states using a single input channel. In one example, a technique may be used to automatically detect if a drill pipe for drilling a hydrocarbon well is either “in slips” or “not in slips”. This information may be used to gain accurate control of depth estimates, for example in conjunction with activities such as measurement-while-drilling (MWD) or mud logging. To tell whether the drill pipe is “in slips,” the known technique generally only uses a single input channel of hookload data measured on the surface. Another example of making a determination between two possible rig states is a technique used to predict if the drill bit is “on bottom” or “not on bottom.” Again, this method makes use of only a single input channel, namely block position, and is only used to detect one of two “states” of the drilling rig.

In the oilfield industry there is a need to automate processes and applications and to monitor the automated processes and applications. This automation and monitoring may require monitoring of one or more streams of data and interpretation of the data.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is described in conjunction with the appended figures.

FIG. 1 shows a drilling system using automatic rig state detection, according to one embodiment of the present invention.

FIG. 2 is a schematic-type illustration of a processing system for processing data to determine an oilfield application state, according to one embodiment of the present invention.

FIG. 3(a) is a screen shot showing graphs of several data channels collected during a drilling operation as may be processed in accordance with an embodiment of the present invention.

FIG. 3(b) is a screen shot showing a zoom-in on the graphs of several data channels collected during the drilling operation of FIG. 3(a) over a short time interval.

FIG. 3(c) is a screen shot showing graphs of several data channels collected during a drilling operation and interpretations including probabilities of particular drilling activities occurring based on the input data of FIG. 3(a) using a methodology in accordance with an embodiment of the present invention.

FIG. 3(d) is a zoom-in on the screen shot of FIG. 3(c) for a short time interval.

FIG. 4 is a high-level schematic illustration of a RIG data interpretation software program that may be used in accordance with an embodiment of the present invention to compute the interpretations illustrated in FIGS. 3(c) and 3(d).

FIG. 5 is a schematic-type illustration depicting use of drilling knowledge to interpret drilling data to produce an interpretation, in accordance with an embodiment of the present invention.

FIG. 6 is a screen dump of a portion of an ontology of drilling, in accordance with an embodiment of the present invention.

FIGS. 7 through 11 are schematic-type illustrations of activities in a drilling activity grammar as may be used in the interpretation methodology of FIG. 5 or the like, in accordance with an embodiment of the present invention.

FIG. 12 is a block diagram illustrating the components of a data interpretation program generated by a data interpretation program generator, in accordance with an embodiment of the present invention.

FIG. 13 is a block diagram illustrating a process for interpreting a time sequence of input data, to compute probability values for leaf states and for activities defined by an activity grammar, in accordance with an embodiment of the present invention.

FIG. 14 is a block diagram illustrating exemplary pseudo code implementing the process illustrated and discussed in conjunction with FIG. 13, in accordance with an embodiment of the present invention.

FIG. 15 is a block diagram illustration of a code for computing a current state probability vector, in accordance with an embodiment of the present invention.

FIG. 16 is a block diagram illustrating a relationship between the stochastic grammar illustrated in FIG. 5, the interpretation program code generator described in FIG. 5, and a TRANS-PROB matrix and a DATA-STATE-PROB matrix of the data interpretation program, in accordance with an embodiment of the present invention.

FIG. 17 is a grammar that is relied upon herein to provide an illustrative example of a process for generating the TRANS-PROB matrix, in accordance with an embodiment of the present invention.

FIG. 18 is an example of a TRANS-PROB matrix corresponding to the grammar of FIG. 17, in accordance with an embodiment of the present invention.

FIG. 19 is an abstraction of a grammar showing four leaf states without providing specific grammar rules for transitioning between these four leaf states, in accordance with an embodiment of the present invention. (This abstraction is provided merely by way of example for illustrative purposes).

FIG. 20 is a continuation of FIG. 19 and provides a table illustrating values for traits making up the configurations of the leaf activities of FIG. 19.

FIG. 21 illustrates the compatible configurations for the configurations defined in FIG. 20.

FIG. 22 provides a table of the values of the traits making up the configurations corresponding to six data values used in the example introduced in FIG. 19.

FIG. 23 illustrates the compatible configurations for the configurations defined in FIG. 22.

FIG. 24 is a table of sequence of flags that reflect whether a particular corresponding configuration of the state is compatible with the configuration of the data value, in accordance with an embodiment of the present invention.

FIG. 25 is a table illustrating an intermediate step in the computation of the DATA-STATE-PROB matrix corresponding to the example introduced in FIG. 19, in accordance with an embodiment of the present invention.

FIG. 26 is an illustration of a resulting matrix following a normalization operation, in accordance with an embodiment of the present invention.

FIG. 27 provides an example of a TRANS-PROB matrix provided merely by way of example for the purposes of illustrating the operation of the interpretation program.

FIG. 28 is an illustration of results of operation of an interpretation program 401 using the process described in conjunction with FIGS. 13 through 15 and the TRANS-PROB matrix of FIG. 27 and the DATA-STATE-PROB matrix of FIG. 26, in accordance with an embodiment of the present invention.

FIG. 29 illustrates some confusion matrices corresponding to the example given herein above, in accordance with an embodiment of the present invention.

FIG. 30 is a flow-chart illustrating a process of applying the data interpretation program introduced herein to evaluate multiple hypotheses relevant to the origin of a data set, in accordance with an embodiment of the present invention.

In the appended figures, similar components and/or features may have the same reference label. Further, various components of the same type may be distinguished by following the reference label by a dash and a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.

DETAILED DESCRIPTION

In the following detailed description, reference is made to the accompanying drawings that show, by way of illustration, specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It is to be understood that the various embodiments of the invention, although different, are not necessarily mutually exclusive. For example, a particular feature, structure, or characteristic described herein in connection with one embodiment may be implemented within other embodiments without departing from the scope of the invention. In addition, it is to be understood that the location or arrangement of individual elements within each disclosed embodiment may be modified without departing from the spirit and scope of the invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims, appropriately interpreted, along with the full range of equivalents to which the claims are entitled. In the drawings, like numerals refer to the same or similar functionality throughout the several views.

It should also be noted that in the development of any such actual embodiment, numerous decisions specific to circumstance must be made to achieve the developer's specific goals, such as compliance with system-related and business-related constraints, which will vary from one implementation to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming but would nevertheless be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure.

In this disclosure, the term “storage medium” may represent one or more devices for storing data, including read only memory (ROM), random access memory (RAM), magnetic RAM, core memory, magnetic disk storage mediums, optical storage mediums, flash memory devices and/or other machine readable mediums for storing information. The term “computer-readable medium” includes, but is not limited to portable or fixed storage devices, optical storage devices, wireless channels and various other mediums capable of storing, containing or carrying instruction(s) and/or data.

Embodiments of the present invention provide a method of describing oilfield operations in a knowledge representation that contains a grammar for interpreting oilfield application data. Merely by way of example, in some embodiments, methods of describing drilling operations in a knowledge representation that contains a grammar for interpreting drilling data are provided. However, the methods herein disclosed may be used on other oilfield applications, such as hydrocarbon production, well completions, well logging, well interpretation, recovery operations, stimulation or the like. The knowledge representation of embodiments of the present invention may include representation of uncertainty.

For example, given a representation of oilfield operations, such as drilling activities or the like, that have component subactivities, the representation may include probabilities for transitioning from one such subactivity to another. The method, when applied, may provide for an efficient way of interpreting input data to determine the probability that the input data is indicative of certain activities occurring and is therefore a valuable tool in analyzing an oilfield application, such as the operations of a drilling rig or the like.

FIG. 1 shows a drilling system 10 using automatic rig state detection, according to one embodiment of the present invention. Drill string 58 is shown within borehole 46. Borehole 46 is located in the earth 40 having a surface 42. Borehole 46 is being cut by the action of drill bit 54. Drill bit 54 is disposed at the far end of the bottomhole assembly 56 that is attached to and forms the lower portion of drill string 58. Bottomhole assembly 56 contains a number of devices including various subassemblies. According to an embodiment of the invention, measurement-while-drilling (MWD) subassemblies are included in subassemblies 62. Examples of typical MWD measurements include direction, inclination, survey data, downhole pressure (inside the drill pipe, and outside or annular pressure), resistivity, density, and porosity. Also included is a subassembly 62 for measuring torque and weight on bit. The signals from the subassemblies 62 are preferably processed in processor 66. After processing, the information from processor 66 is communicated to pulser assembly 64. Pulser assembly 64 converts the information from processor 66 into pressure pulses in the drilling fluid. The pressure pulses are generated in a particular pattern which represents the data from subassemblies 62. The pressure pulses travel upwards through the drilling fluid in the central opening in the drill string and towards the surface system. The subassemblies in the bottomhole assembly 56 can also include a turbine or motor for providing power for rotating and steering drill bit 54. In different embodiments, other telemetry systems, such as wired pipe, fiber optic systems, acoustic systems, wireless communication systems and/or the like may be used to transmit data to the surface system.

The drilling rig 12 includes a derrick 68 and hoisting system, a rotating system, and a mud circulation system. The hoisting system, which suspends the drill string 58, includes draw works 70, fast line 71, crown block 75, drilling line 79, traveling block and hook 72, swivel 74, and deadline 77. The rotating system includes kelly 76, rotary table 88, and engines (not shown). The rotating system imparts a rotational force on the drill string 58 as is well known in the art. Although a system with a kelly and rotary table is shown in FIG. 1, those of skill in the art will recognize that the present invention is also applicable to top drive drilling arrangements. Although the drilling system is shown in FIG. 1 as being on land, those of skill in the art will recognize that the present invention is equally applicable to marine environments.

The mud circulation system pumps drilling fluid down the central opening in the drill string. The drilling fluid is often called mud, and it is typically a mixture of water or diesel fuel, special clays, and other chemicals. The drilling mud is stored in mud pit 78. The drilling mud is drawn into mud pumps (not shown), which pump the mud through standpipe 86 and into the kelly 76 through swivel 74, which contains a rotating seal.

The mud passes through drill string 58 and through drill bit 54. As the teeth of the drill bit grind and gouge the earth formation into cuttings, the mud is ejected out of openings or nozzles in the bit with great speed and pressure. These jets of mud lift the cuttings off the bottom of the hole and away from the bit 54, and up towards the surface in the annular space between drill string 58 and the wall of borehole 46.

At the surface, the mud and cuttings leave the well through a side outlet in blowout preventer 99 and through a mud return line (not shown). Blowout preventer 99 comprises a pressure control device and a rotary seal. The mud return line feeds the mud into a separator (not shown), which separates the mud from the cuttings. From the separator, the mud is returned to mud pit 78 for storage and re-use.

Various sensors are placed on the drilling rig 10 to take measurements of the drilling equipment. In particular, hookload is measured by hookload sensor 94 mounted on deadline 77, and block position and the related block velocity are measured by block sensor 95, which is part of the draw works 70. Surface torque is measured by a sensor on the rotary table 88. Standpipe pressure is measured by pressure sensor 92, located on standpipe 86. Additional sensors may be used to detect whether the drill bit 54 is on bottom. Signals from these measurements are communicated to a central surface processor 96. In addition, mud pulses traveling up the drill string are detected by pressure sensor 92. Pressure sensor 92 comprises a transducer that converts the mud pressure into electronic signals. The pressure sensor 92 is connected to surface processor 96, which converts the pressure signal into digital form, then stores and demodulates the digital signal into usable MWD data. According to various embodiments described above, surface processor 96 is programmed to automatically detect the most likely rig state based on the various input channels described. Processor 96 is also programmed to carry out the automated event detection as described above. Processor 96 preferably transmits the rig state and/or event detection information to user interface system 97, which is designed to warn the drilling personnel of undesirable events and/or suggest activity to the drilling personnel to avoid undesirable events, as described above. In other embodiments, interface system 97 may output a status of drilling operations to a user, which may be a software application, a processor and/or the like, and the user may manage the drilling operations using the status.

Processor 96 may be further programmed, as described below, to interpret the data collected by the various sensors in terms of the activities that may have occurred in producing the collected data. Such interpretation may be used to understand the activities of a driller, to automate particular tasks of a driller, and to provide training for drillers.

FIG. 2 shows further detail of processor 96, according to preferred embodiments of the invention. Processor 96 preferably consists of one or more central processing units 350, main memory 352, communications or I/O modules 354, graphics devices 356, a floating point accelerator 358, and mass storage such as tapes and discs 360. It should be noted that while processor 96 is illustrated as being part of the drill site apparatus, it may also be located, for example, in an exploration company data center or headquarters.

FIG. 3(a) is a screenshot of 11 data channels logged as part of a drilling operation and one data channel that is an interpretation of a subset of the 11 logged data channels. Channel 301 is a plot of the depth (DEPT) and horizontal depth (HDTH). Channel 303 is a plot of block position (BPOS). Channel 305 is a plot of block velocity (BVEL). Channel 307 is a plot of hook load (HKLD). Channel 309 is a plot of standpipe pressure (SPPA). Channel 311 is a plot of mud flow rate in (FLWI). Channel 313 is a plot of rotational speed (RPM). Channel 315 is a plot of surface torque (STOR). Channel 317 is a plot of rate of penetration (ROP). Channel 319 is a plot of a binary value that indicates whether the bit is on bottom (BONB), and channel 321 is a plot of a binary value indicating whether the rig is “in slips” (SLIPSTAT).

FIG. 3(b) zooms in on a small section along the time index of the screen shot from FIG. 3(a), thereby spreading out the data to show greater detail.

As described in U.S. Pat. Nos. 6,868,920 and 7,128,167, which are commonly owned by the owner of the present application and are incorporated herein in their entirety by reference for all purposes, various sensor data, i.e., one or more of the data channels shown in FIG. 3, may be communicated via the communications modules 354 and may be interpreted to determine a rig state. The rig states, in one embodiment, may include the following states: DrillRot, DrillSlide, RihPumpRot, RihPump, Rih, PoohPumpRot, PoohPump, Pooh, StaticPumpRot, StaticPump, Static, In slips, Unclassified, where Rih=Run in Hole, Rot=Rotate, and Pooh=Pull out of hole. These states correspond to a numerical value in the RIG channel that is also logged and depicted as channel 323 in FIG. 3.

Table I is a listing of RIG channel values and corresponding configurations:

TABLE I
RIG Channel Values and Configurations

Integer                   TRAITS
Value    Rig State        Rotation  Pumping  Block  Bottom      Slips
0        Rotary Drill     On        On       Slow   On Bottom   Not Slips
1        Slide Drill      Off       On       Slow   On Bottom   Not Slips
2        In Slips         -         -        -      -           In Slips
3        Ream             On        On       Down   Off Bottom  Not Slips
4        Run In Pump      Off       On       Down   Off Bottom  Not Slips
5        Run In Rotate    On        Off      Down   Off Bottom  Not Slips
6        Run In           Off       Off      Down   Off Bottom  Not Slips
7        Back Ream        On        On       Up     Off Bottom  Not Slips
8        Pull Up Pump     Off       On       Up     Off Bottom  Not Slips
9        Pull Up Rotate   On        Off      Up     Off Bottom  Not Slips
10       Pull Up          Off       Off      Up     Off Bottom  Not Slips
11       Rotate Pump      On        On       Stop   Off Bottom  Not Slips
12       Pump             Off       On       Stop   Off Bottom  Not Slips
13       Rotate           On        Off      Stop   Off Bottom  Not Slips
14       Stationary       Off       Off      Stop   Off Bottom  Not Slips
15       Unclassified     -         -        -      -           -
16       Absent           -         -        -      -           -
17       Data Gap         -         -        -      -           -

In addition to the physical traits such as Rotation, the grammar of Appendix A defines traits for datagap, classified, and absent. These traits reflect the presence or consistency of the data. For example, a configuration that is not compatible with any of the first 15 data values would be unclassified. Where data is missing for one index value in the data, the data would be absent, and if no data is recorded (including no index value), the data would be a datagap. For rig states 0 through 14, these traits all have the values classified, not absent, and not datagap. Conversely, rig states 15 through 17 correspond to the conditions resulting in those particular states, e.g., for 15 unclassified, the trait values are unclassified, not absent, and not datagap.

A configuration is a particular combination of traits. The RIG channel is an assignment of a value corresponding to values collected from sensors that indicate a combination of traits corresponding to particular drilling conditions and operations. For this example, the traits are Rotation, Pumping, Block, Bottom, and Slips: Rotation signifies whether the drill string is rotating or not; Pumping signifies whether drilling mud is being pumped; Block indicates the movement of the block, i.e., up, down, slow or no movement; Bottom indicates whether the bit is on bottom; and Slips reflects whether the drill string is in slips or not. Thus, a configuration is a particular combination of these trait values. Of course, given five variables, some of which take on several different values, the universe of configurations is rather large. However, some combinations of traits may not make sense. These nonsensical combinations are relegated to the Unclassified configuration. Drilling data may be collected at particular time intervals. As such, in some embodiments of the present invention, if for any given time index, data is recorded as NIL, the Absent value is assigned to the RIG channel. Similarly, if no data is recorded at all, the Data Gap value is assigned to the RIG channel.
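The assignment of RIG channel values from trait combinations, as listed in Table I, can be sketched as a simple lookup. The following Python is only an illustrative sketch and not the implementation of this disclosure: the names `RIG_TABLE`, `classify` and `rig_channel` are hypothetical, a NIL record is modeled as `None`, a data gap is modeled as a time index that has no record at all, and the dashes in the In Slips row are assumed to mean that the other traits are ignored.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# A configuration is (Rotation, Pumping, Block, Bottom, Slips), as in Table I.
Config = Tuple[str, str, str, str, str]

# Rows 0, 1 and 3 through 14 of Table I; all of them are "Not Slips".
_ROWS = [
    (0, "On", "On", "Slow", "On Bottom"),     # Rotary Drill
    (1, "Off", "On", "Slow", "On Bottom"),    # Slide Drill
    (3, "On", "On", "Down", "Off Bottom"),    # Ream
    (4, "Off", "On", "Down", "Off Bottom"),   # Run In Pump
    (5, "On", "Off", "Down", "Off Bottom"),   # Run In Rotate
    (6, "Off", "Off", "Down", "Off Bottom"),  # Run In
    (7, "On", "On", "Up", "Off Bottom"),      # Back Ream
    (8, "Off", "On", "Up", "Off Bottom"),     # Pull Up Pump
    (9, "On", "Off", "Up", "Off Bottom"),     # Pull Up Rotate
    (10, "Off", "Off", "Up", "Off Bottom"),   # Pull Up
    (11, "On", "On", "Stop", "Off Bottom"),   # Rotate Pump
    (12, "Off", "On", "Stop", "Off Bottom"),  # Pump
    (13, "On", "Off", "Stop", "Off Bottom"),  # Rotate
    (14, "Off", "Off", "Stop", "Off Bottom"), # Stationary
]
RIG_TABLE: Dict[Config, int] = {
    (rot, pump, block, bottom, "Not Slips"): value
    for value, rot, pump, block, bottom in _ROWS
}
IN_SLIPS, UNCLASSIFIED, ABSENT, DATA_GAP = 2, 15, 16, 17


def classify(config: Optional[Config]) -> int:
    """Map one recorded configuration to a RIG channel value."""
    if config is None:
        return ABSENT  # data recorded as NIL for this time index
    if config[4] == "In Slips":
        return IN_SLIPS  # assumption: other traits are ignored when in slips
    # Any combination not listed in Table I is treated as Unclassified.
    return RIG_TABLE.get(config, UNCLASSIFIED)


def rig_channel(records: Dict[int, Optional[Config]],
                indexes: Iterable[int]) -> List[int]:
    """Build the RIG channel over the expected time indexes."""
    # Assumption: an index with no record at all is a data gap.
    return [classify(records[t]) if t in records else DATA_GAP
            for t in indexes]


records = {0: ("On", "On", "Slow", "On Bottom", "Not Slips"), 1: None}
channel = rig_channel(records, range(3))
```

Here index 0 is a rotary drilling configuration, index 1 was recorded as NIL, and index 2 was never recorded, so the three cases of a classified value, Absent and Data Gap are all exercised.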

While in one embodiment the invention may be used to interpret activities that correspond to values of the RIG channel, in other embodiments, other data values may be interpreted, either as combinations of data channels forming configurations in a similar manner to that discussed above for the RIG channel or for single channel data sets.

FIG. 4 is a high-level schematic illustration of a RIG data interpretation software program 401 that is operable to further interpret the log data collected during a drilling operation. The program 401 is advantageously stored on a mass storage device 360 of the computer system 96 or on an interpretation computer system that is not located at the rig site but has, for example, an architecture similar to that of computer system 96. In one embodiment, described herein, the program determines the activities that occur or have occurred during a drilling operation by analyzing a time sequence of the RIG channel 323. In an alternative embodiment, additional data channels are used for determining the activities that occur or have occurred during a drilling operation.

The drilling data interpretation program 401 may accept as input a drilling knowledge base 403 and drilling data 405. The drilling data 405 may be drilling log data, for example, as depicted in FIGS. 3(a) and 3(b), a subset of the drilling log data, or, for example, the RIG_STATE channel 323. The drilling knowledgebase 403 is described in further detail below. As is discussed herein below, in an alternative embodiment, the drilling data interpretation program 401 may be constructed from the drilling knowledge in the drilling knowledgebase 403. It may, in accordance with an embodiment of the present invention, then be reused for interpreting subsequent data sets without accessing the knowledgebase 403.

The output of the drilling data interpretation program may be some form of interpretation 407 of the drilling data 405, e.g., a report of the activities that are occurring or have occurred during a drilling operation. The interpretation output 407 may be an interpretation of the input data using the knowledge contained in the knowledge base 403.

Embodiments of the present invention described herein may be used on a variety of data channels and provide a variety of interpretations. Herein, merely for purposes of example, the interpretations that are made from the Rig Status channel 323 include four separate channels as illustrated in FIGS. 3(c) and 3(d). FIGS. 3(c) and 3(d) contain, in addition to a subset of the channels illustrated in FIGS. 3(a) and 3(b), four interpretation channel graphs containing curves for several interpretation probability variables (in italics in the table below):

1. (Channel graph 325) Type of Drilling Operation, i.e., whether the rig is used for drilling rotary (drill rotary), drilling sliding (drill sliding), or neither
2. (Channel graph 327) Actively making hole (Make Hole), wiping the hole (Wipe Hole), or merely circulating mud by pumping (Circulate)
3. (Channel graph 329) Actively drilling (drilling), adding stand (Add Stand)
4. (Channel graph 331) Activity unknown (Unknown)

For each interpretation channel plot 325 through 331 there are logs for each of the interpretation probability variables. For example, considering graph 329, for most of the displayed section of FIG. 3(c) the Drilling plot and the Add Stand plot behave in an essentially binary manner, e.g., there is a 1.0 probability of drilling at the same time as there is a 0.0 probability of adding stand. However, in the section near time-mark 23, the Add Stand plot indicates a probability of approximately 0.2-0.3 and, conversely, the Drilling plot indicates a probability of drilling of approximately 0.7-0.8. In other words, the plotted curves in graphs 325 through 331 indicate the probability of a particular activity.
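One way to read such curves is to report, for each time index, the activity with the highest probability. The Python sketch below illustrates this under stated assumptions: the function name `most_likely` is hypothetical, and the sample values are illustrative only, chosen to fall within the ranges quoted above for graph 329 rather than read from any actual log.

```python
from typing import Dict, List

# Illustrative values only (not taken from FIG. 3(c)): mostly binary,
# with one index where Add Stand rises to 0.25 and Drilling drops to 0.75.
graph_329: Dict[str, List[float]] = {
    "Drilling": [1.0, 1.0, 0.75, 1.0],
    "Add Stand": [0.0, 0.0, 0.25, 0.0],
}


def most_likely(curves: Dict[str, List[float]]) -> List[str]:
    """Return the name of the most probable activity at each time index."""
    length = len(next(iter(curves.values())))
    return [max(curves, key=lambda name: curves[name][t])
            for t in range(length)]


labels = most_likely(graph_329)
```

Taking the maximum discards the uncertainty that the curves carry, so a consumer that needs the uncertainty would keep the probabilities themselves rather than only the label.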

Having described the input and the interpretation result, the methodology of interpreting the input data is now described, which methodology of interpretation may in some embodiments of the present invention take uncertainty into account and may produce the interpretation results.

FIG. 5 is a schematic illustration of one embodiment of using drilling knowledge to interpret drilling data 405 to produce an interpretation 407. In the example of FIG. 5, the drilling knowledgebase 403 is used by an interpretation program code generator 507 to produce a data interpretation program 401.

The drilling knowledgebase 403 may be contained in a hierarchical structure 501 known as an ontology. A sample ontology is depicted and described in co-pending application to Bertrand du Castel et al., entitled “SYSTEM AND METHOD FOR AUTOMATING EXPLORATION OR PRODUCTION OF SUBTERRANEAN RESOURCES” filed contemporaneously with this application, commonly owned by the owner of the present application, and incorporated by reference herein for all purposes.

The ontology 501 may be input into an Ontology-to-Activity-Grammar program 503, the output of which is an activity grammar 505. In an alternative embodiment, the drilling knowledge is contained directly in an activity grammar 505. FIG. 6 is a screen dump of a portion of an ontology of drilling 501. A corresponding text version of the stochastic grammar ontology may be found in Appendix A—DRILLING STATES ONTOLOGY LISTING.

An Activity Grammar 505 contains, for example:

- Activity descriptions for a number of activities, wherein each activity is described as a start state, a finish state, and one or more subactivities performed during the activity. There may be any number of levels of subactivities, i.e., a subactivity may be further composed of other subactivities.
- Transitional probabilities defining the probability of transitioning from one subactivity to another subactivity, from the start state to a particular subactivity, and from a subactivity to the finish state.
- A number of leaf activities. A leaf activity is an activity that does not include any subactivities.
- Configuration variables that define the configuration of an activity with respect to particular traits, which are values of particular observed conditions, e.g., whether the bit is on bottom, whether the mud circulating pump is on or not, whether the block is moving, the direction it is moving, etc., wherein a configuration is a combination of trait values.
- Specification of a top level activity. For example, the activity drill_well corresponds to a defined activity that is composed of several subactivities; drill_well is defined in the activity grammar at lines A-1471 through A-1555. Without meta information, drill_well would logically be the top level activity. However, in the actual implementation, for implementation reasons, the top level activity is a combination of the drill_well activity and the meta activity defined at A-1760 through A-1818.
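The elements listed above suggest a simple in-memory structure. The following Python is a minimal sketch under stated assumptions and is not the grammar format of Appendix A: the class names `Activity` and `ActivityGrammar`, the method names, and the toy activities (`toy_top`, `toy_mid`, `toy_leaf_a`, `toy_leaf_b`) with their probabilities and trait values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

START, FINISH = "start", "finish"


@dataclass
class Activity:
    name: str
    # (source, target) -> probability; source and target are subactivity
    # names, or START / FINISH for the activity's own start and finish states.
    transitions: Dict[Tuple[str, str], float] = field(default_factory=dict)
    # Trait name -> trait value, e.g. {"Pumping": "On"}.
    configuration: Dict[str, str] = field(default_factory=dict)

    def subactivities(self) -> List[str]:
        names: List[str] = []
        for pair in self.transitions:
            for name in pair:
                if name not in (START, FINISH) and name not in names:
                    names.append(name)
        return names

    def is_leaf(self) -> bool:
        # A leaf activity has no subactivities.
        return not self.subactivities()


@dataclass
class ActivityGrammar:
    top_level: str
    activities: Dict[str, Activity]

    def leaf_states(self, name: str = "") -> List[str]:
        """Collect the leaf activities reachable from an activity."""
        activity = self.activities[name or self.top_level]
        if activity.is_leaf():
            return [activity.name]
        leaves: List[str] = []
        for sub in activity.subactivities():
            for leaf in self.leaf_states(sub):
                if leaf not in leaves:
                    leaves.append(leaf)
        return leaves


# Hypothetical toy grammar; names and numbers are for illustration only.
grammar = ActivityGrammar(
    top_level="toy_top",
    activities={
        "toy_top": Activity("toy_top", {
            (START, "toy_mid"): 0.5, (START, "toy_leaf_b"): 0.5,
            ("toy_mid", FINISH): 1.0, ("toy_leaf_b", FINISH): 1.0,
        }),
        "toy_mid": Activity("toy_mid", {
            (START, "toy_leaf_a"): 1.0,
            ("toy_leaf_a", "toy_leaf_b"): 0.3, ("toy_leaf_a", FINISH): 0.7,
            ("toy_leaf_b", FINISH): 1.0,
        }),
        "toy_leaf_a": Activity("toy_leaf_a", {(START, FINISH): 1.0},
                               {"Pumping": "On"}),
        "toy_leaf_b": Activity("toy_leaf_b", {(START, FINISH): 1.0},
                               {"Pumping": "Off"}),
    },
)
leaves = grammar.leaf_states()
```

Storing transitions per activity keeps each activity a self-contained state machine, which mirrors the description of subactivities nested to any number of levels.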

Each of these elements of the stochastic grammar 505 is described herein below.

Activity Descriptions

FIG. 7 is a schematic illustration of the activity drill_well represented by stochastic finite state machine 601. In a sense, activities as defined in the activity grammar are probabilistic finite state machines, i.e., finite state machines in which transitions from one state to another have assigned probabilities. In FIG. 7 the activity finite state machine (AFSM) 601 corresponds to the activity grammar code (Appendix A, Lines A-1471 through A-1555). The drill_well AFSM 601 has a start state 603 and a finish state 605. From the start state 603, there are transitions to each of three sub-activities, namely, drill_a_section 611, trip_in 609, and trip_out 613, with transition probabilities 0.4, 0.4, and 0.2. These transitions and transitional probabilities are defined in the code at Lines A-1473 through A-1479, A-1480 through A-1486, and A-1501 through A-1507.
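The start-state transitions of drill_well given above form a probability distribution over the first subactivity. The Python sketch below checks that the stated values sum to one and draws sample first subactivities from them. Only the three probabilities come from the text; the names `DRILL_WELL_START`, `check_distribution` and `sample_next` are hypothetical, and sampling is shown purely to illustrate the stochastic behavior, not as part of the interpretation method.

```python
import random
from typing import Dict

# Start-state transition probabilities of drill_well as stated in the text.
DRILL_WELL_START: Dict[str, float] = {
    "drill_a_section": 0.4,
    "trip_in": 0.4,
    "trip_out": 0.2,
}


def check_distribution(probs: Dict[str, float], tol: float = 1e-9) -> bool:
    """True if the probabilities are non-negative and sum to 1 within tol."""
    return (all(p >= 0.0 for p in probs.values())
            and abs(sum(probs.values()) - 1.0) <= tol)


def sample_next(probs: Dict[str, float], rng: random.Random) -> str:
    """Draw one target state according to the transition probabilities."""
    names = list(probs)
    weights = [probs[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = {name: 0 for name in DRILL_WELL_START}
for _ in range(10_000):
    counts[sample_next(DRILL_WELL_START, rng)] += 1
```

Over many draws, the relative frequencies in `counts` should approach 0.4, 0.4 and 0.2.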

Drilling a section is defined as a continuous drilling operation that is terminated by an activity that does not fit within the grammar definition for the drill_a_section activity 611, see below. Therefore, at the conclusion of drill_a_section 611, or a sequence of drill_a_section activities, the AFSM drill_well transitions to the finish state 605.

Now consider the activity drill_a_section (Lines A-1557 through A-1607) illustrated in FIG. 8. The activity drill_a_section has several component activities, also known as subactivities. The subactivities of an activity are states in the finite state machine corresponding to the activity. Thus, there is a one-to-one mapping between activities and states. The drill_a_section activity 701 is depicted graphically in FIG. 8. The drill_a_section 701 is composed of the subactivities: pre_drill_stand 702, drill_a_stand 703, pre_add_stand 704 and add_a_stand 705 (in addition to the start 707 and finish states 709).

FIG. 9 is a schematic illustration of the drill_a_stand activity 703.

FIG. 10 is a schematic illustration of an activity with a repeating subactivity. The trip_in activity 711 includes one subactivity, the trip_in_stand subactivity 713, which may repeat up to 100 times.

FIG. 11 is a schematic illustration of a leaf activity, namely, the make_hole activity 715. The make_hole activity is defined in Appendix A at lines A-1188 through A-1213. The make_hole activity has no subactivities, and the only transition is directly from its start state 717 to its finish state 719. In the context of the finite state machine representation of the stochastic grammar, leaf activities are referred to as leaf states.

The example of Appendix A defines the following leaf activities:

TABLE II Leaf Activities from the Example of Appendix A

lower_to_bottom
run_into_hole
lift_out_of_slips
circulate
wipe_up
wipe_down
lower_into_slips
make_hole
pull_out_of_hole
in_slips
connect_stand
unclassified
absent
datagap
unknown

Transitional Probabilities

As noted above, each activity, other than leaf activities or bottom level activities, comprises one or more subactivities. The activity has specified transitional probabilities and a start and finish state. For example, the drill_well activity 601 defines transitions from trip_in 609 to trip_out 613 and drill_a_section 611. In the example of drill_well, the transitional probabilities from its start state 603 to drill_a_section and to trip_in are each 0.4. These probabilities represent the probabilities that a well drilling operation commences with drilling a section or tripping in, respectively. In some circumstances, drilling a well may start with a tripping out operation, represented by a 0.2 probability transition from the start state to the trip_out subactivity 613.

As illustrated in FIGS. 7 through 10, activities correspond to finite state machines with transitions from the start state to the finish state through a sequence of subactivities. The activity grammar defines transitional probabilities for the transitions from the start state to these various subactivities, to one another, and to the finish state.

For example, Lines A-1473 through A-1542 define the transitional probabilities of the activity drill_well, corresponding to the transitional probabilities illustrated in FIG. 7 and discussed hereinabove.

Configuration Variables and Leaf Activities

The grammar has certain activities that do not have further subactivities; these are leaf activities. Associated with each leaf activity are values for certain traits. The traits may be defined in superactivities of the leaf activities and inherited by the leaf activities. A combination of trait values constitutes a configuration; by definition, the traits have these values when the leaf activity is being performed. The configuration variables, in a preferred embodiment, include pump, rotate (optionally), block, bottom, and slips.

Pump has the values on and off, and indicates whether the pump circulating drilling mud through the drillpipe is pumping (on) or not (off).

Rotate defines whether the drillstring is rotating or not.

Block indicates the movement of the block and has the values up, down, and stop (i.e., no movement).

Bottom indicates whether the bit is on the bottom of the borehole and has the values onbottom and offbottom.

Slips indicates whether the drillstring is inslips or notinslips.

Each leaf state is defined by particular values for each of the configuration variables. Configurations are particular combinations of trait values. For example, Lines A-1084 through A-1109 define that the activity circulate has the values pump=on, rotate=on, block=stop, bottom=offbottom, and slips=notslips. In other words, when the activity is circulate, by definition the pump is pumping, the drill string is rotating, the block is not moving, and the drillstring is off the bottom of the borehole and not in slips.

In addition to the traits pump, rotate, block, bottom, and slips, the ontology of Appendix A defines several traits that are not directly associated with drilling operations, but rather with the data collected. These include classified, datagap and absent. Classified indicates that the trait combination recorded by the observed data translates to a data value in the RIG channel, i.e., if the combination of pump, rotate, block, bottom, and slips does not produce a RIG channel data value, the configuration is said to not be Classified. Datagap is used to signify a sequence of data points without recorded data. Absent indicates a missing data value.

Declaring configurations for the leaf activities specifies connections to the observations that lead to a conclusion that the drilling rig is operating according to that leaf activity. Thus, the system defines some configuration variables, namely pump, block, bottom, rotate and slips. These correspond to the data channels and correspond to the RIG STATE data channel. Furthermore, these define important variables that are characterized into discrete cases, e.g., the block is going down, pumping is off or on, the drillstring is either rotating or not, the bit is either on bottom or not on bottom, and the drillstring is in or not in slips. In an embodiment of the present invention, qualitative variables may be used that couple to the actual data. To decide whether the drilling process is pumping or not, in aspects of the present invention, a threshold is defined above which the system is deemed to be pumping.

This threshold may be determined/analyzed/interpreted probabilistically. When looking at a measurement with a threshold, a value far from the threshold gives high certainty about the meaning of the data, e.g., a standpipe pressure well above the determined pumping threshold means that the probability of pumping in the system is high, whereas a standpipe pressure well below the pumping threshold means that the probability of pumping is low. Pressure data around the threshold means the probability of pumping or not pumping is about fifty-fifty. As the pipe pressure rises, the probability of pumping goes from zero, to fifty percent, to 100 percent.
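The probabilistic threshold described above can be sketched as a smooth curve centred on the threshold. The following is a minimal illustration only: the logistic shape, the function name `pumping_probability`, and the `softness` parameter are assumptions introduced here, since the text states only that the probability rises from zero, through fifty percent at the threshold, to 100 percent.

```python
import math


def pumping_probability(standpipe_pressure, threshold, softness):
    """Return the probability that the pump is on for a pressure reading.

    Assumption: a logistic curve centred on the threshold. The width of
    the uncertain band around the threshold is set by `softness`, a
    hypothetical tuning parameter in the same units as the pressure.
    """
    z = (standpipe_pressure - threshold) / softness
    # Evaluate in a numerically stable way for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical numbers: threshold of 500 pressure units, softness of 50.
for pressure in (0.0, 450.0, 500.0, 550.0, 1000.0):
    print(pressure, round(pumping_probability(pressure, 500.0, 50.0), 3))
```

Readings far below the threshold map to a probability near zero, readings at the threshold to one half, and readings far above it to a probability near one.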

The specific configuration variable values for each leaf state may be found in Appendix A, e.g., for make_hole, at A-1189 through A-1212, which defines that the configuration for make_hole is slips=notslips, pump=on, block=slow, bottom=onbottom; rotate is not specified.

Top-Level Activity

The grammar 505 defines a top level activity from which certain operations of the generation of the data interpretation program 501 may commence. For example, determination of transitional probabilities from one leaf-state to another leaf-state is performed by traversing the grammar. That traversal begins at the top-level activity.

Returning now to FIG. 5. The ontology 501 contains a data representation of uncertainty in drilling knowledge. Uncertainty in drilling knowledge includes uncertainties in the manner in which one would interpret a particular data condition. Consider, for example, the knowledge that a driller is drilling a section of a well and that in so doing the driller is drilling a stand of drill pipe. There is an uncertainty as to whether the driller will, at the conclusion of that operation, drill another stand or will have finished drilling the section of the well. Experience may have shown that 90% of the time after drilling a stand another stand is added and 10% of the time the drilling of the section has finished. The activity grammar 505 contains this type of probability knowledge about the flow of drilling operations.

CODE GENERATOR 507

In one embodiment of the present invention, the code generator 507 accepts as input the activity grammar 505 (e.g., as listed in Appendix A) and produces the Data Interpretation Program 401 that when executed may be used to interpret the input data 405 and produce an interpretation 407 of the data in terms of the activities of the grammar 505. A sample code generator 507 written in the Java programming language is listed in Appendix B. This sample code generator accepts as input the grammar 505 that is represented in listing form in Appendix A.

FIG. 12 is a block diagram illustrating the components of the data interpretation program 401 generated by the code generator 507. The data interpretation program 401 consists of three major components: a transition probability matrix (TRANS-PROB) 451, which is a matrix containing the probabilities of transitioning from one activity (i.e., one state) to another; a data-to-state probability matrix (DATA-STATE-PROB) 453, which is a matrix containing the probabilities that a given data value corresponds to each particular state; and the program code 455 that applies the TRANS-PROB matrix 451 and the DATA-STATE-PROB matrix 453 to compute an interpretation of the input data in terms of probabilities that the input data corresponds to particular activities defined in the knowledge base 403.

The mechanism for building the TRANS-PROB matrix 451 and the DATA-STATE-PROB matrix 453 is described herein below. Before discussing how the code generator 507 builds these matrices we describe the operation of the code 455 that applies these matrices to interpret input data, e.g., a RIG states channel.

FIG. 13 is a block diagram illustrating the process of interpreting a time sequence of input data, e.g., the RIG State channel 323, to compute probability values for each leaf state and for each activity defined by the activity grammar 505, in accordance with an embodiment of the present invention. FIG. 13 illustrates an example of the operation of code 455. In the illustrated embodiment, the input is the sequence to be interpreted, e.g., the RIG states channel data, and the grammar data structure, e.g., the grammar data structure 511.

Consider the interpretation of a data value Data at time T 200, and the probabilities of the various states at time T−1 201. The input state probabilities vector P(S_(T-1)) indicates, for each leaf activity, the probability that it is the leaf activity occurring at time T−1. Considering the example of Appendix A, there are fifteen leaf activities defined. The vector P(S_(T-1)) therefore has 15 elements, each indicating the probability that one of the leaf activities is occurring at T−1.

The P(S_(T-1)) is matrix-to-vector multiplied 157 with the TRANS-PROB matrix to determine the probability of each leaf state given the probabilities of transitioning from that leaf state to each other state, i.e., P(S_(T)|S_(T-1)). The construction of the TRANS-PROB matrix is described herein below.

The matrix-to-vector multiplication 157 produces a prior state probabilities vector P(S_(T)) 205 in which each element represents the probability that the corresponding leaf state would occur given the state probability vector at T−1. As is discussed herein below, the TRANS-PROB matrix is derived from the transitional probabilities in the grammar 505 and the grammar structure itself. Thus, P(S_(T)) 205 reflects only the transitional probabilities resulting from the grammar without taking the input data Data 200 into account. In Bayesian inference, a prior probability distribution, often called simply the prior, is a probability distribution representing knowledge or belief about an unknown quantity a priori, that is, before any data have been observed.

The prior probability vector P(S_(T)) 205 is adjusted by the probabilities that the data reflects each particular leaf activity P(S_(T)|Data) 207. That task is performed by extracting 211 the vector of probability values corresponding to the Data value 200 in the DATA-STATE-PROB matrix 453. The DATA-STATE-PROB matrix 453 contains the probability value of each leaf activity given a particular data value. The computation of the DATA-STATE-PROB matrix 453 is provided herein below.

The prior probability vector P(S_(T)) 205 is adjusted by the probabilities that the data reflects each particular leaf activity P(S_(T)|Data) 207 by an element-by-element multiplication 161 of each element in the prior probability vector P(S_(T)) 205 by the corresponding element in the data-to-state probability vector P(S_(T)|Data) 207 and normalizing 167 the result thereby obtaining the posterior state probabilities at time T P(S_(T)|Data) 209. Thus, the posterior state probabilities at time T P(S_(T)|Data) 209 take into account both the stochastic grammar 505 and the data values from the data channel.
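The two operations just described (the multiplication 157, and the element-by-element multiplication 161 followed by the normalization 167) may be written compactly as follows. The symbols \pi_T for the prior vector 205 and d_T for the data-to-state vector 207 are shorthand introduced here for readability and do not appear in the figures; the matrix is indexed with the originating state as the row and the destination state as the column.

```latex
\pi_T(j) \;=\; \sum_{i} P(S_{T-1}=i)\;\mathrm{TRANS\text{-}PROB}[i,j]

d_T(j) \;=\; \mathrm{DATA\text{-}STATE\text{-}PROB}[\mathrm{Data}_T,\,j]

P(S_T=j \mid \mathrm{Data}) \;=\; \frac{\pi_T(j)\,d_T(j)}{\sum_{k}\pi_T(k)\,d_T(k)}
```

The denominator in the last expression is the normalization 167, which makes the posterior state probabilities 209 sum to one.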

FIG. 14 is a block diagram illustrating exemplary pseudo code implementing the process illustrated and discussed in conjunction with FIG. 13. A first step may be to clean up the input data, step 131. The data may be cleaned up to provide for missing data, to remove spikes indicative of nonsense/non-probabilistically relevant data and/or the like. In certain aspects, the nonsense/non-probabilistically relevant data may be treated as missing data. In an embodiment of the present invention, the program may interpolate for missing data values.

The pseudo code of FIG. 14 operates to take a current state probability vector (computed at T−1) as an input for the processing of each data value in the data sequence at time T and from it, together with the data value, compute a new current state probability vector reflecting the data value at time T. For each iteration, the current state vector from the previous iteration (each iteration reflecting the processing of a data value in the sequence of data values) is updated. Thus, a first step may be to initialize the array holding the current state vector (CURRENT-STATE-VECTOR), step 135. The initialization may be to give each state the same probability, e.g., if there are 14 different possible states, each state would be given the probability of 1/14=0.0714. An alternative approach is to use a statistical distribution of states from historical data as the basis for the initial probability distribution.

Next, the pseudo code includes a loop iterating over the sequence of data samples to be processed, loop 137, to update the CURRENT-STATE-VECTOR. First, the state probability vector (TRANSITION-PROB-VECT) is computed, step 139. Step 139 is fleshed out in greater detail in FIG. 15. As discussed in conjunction with FIG. 13, step 157, step 139 is a vector-matrix multiplication operation between the CURRENT-STATE-VECTOR and the TRANS-PROB matrix. For each state i in the CURRENT-STATE-VECTOR (i.e., at T−1), outer loop 141, the sum of the probabilities that each possible state j is followed by the state i is calculated using an inner loop 143 that iterates over the possible states j, looking up in the TRANS-PROB matrix the probability that the state j is followed by the state i, step 145. The computed vector TRANSITION-PROB-VECT is the “Prior” States Probabilities 205 of FIG. 13.

Returning to FIG. 14, the processing of a data value in the sequence being processed also includes the computation of the probability vector having values that the Data value at time T corresponds to each possible activity based on the data value, i.e., the Data-to-State Probabilities vector 207 of FIG. 13, step 181, corresponding to operation 211 of FIG. 13. This computation may be a look up operation in the DATA-TO-STATE probability matrix to determine for each possible state the probability that the Data value corresponds to that state.

It should be noted that steps 139 and 181 are independent of one another and may be computed in parallel or in any sequence.

The prior probabilities (TRANSITION-PROB-VECT) 205 are combined with the Data-to-State Probabilities vector 207 by multiplying each value in the prior probabilities vector by the corresponding value in the Data-to-State Probabilities vector, step 183.
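The loop of FIGS. 14 and 15 can be sketched in a few lines. This is an illustration under stated assumptions, not the code of Appendix B: the function names, the two-state matrices and the data values "a" and "b" are hypothetical, and the fallback to a uniform vector when every product is zero is a choice made here that the text does not specify. The normalization follows operation 167 of FIG. 13.

```python
def prior_from_transitions(current, trans_prob):
    """Step 139: vector-matrix product of the current state vector and
    TRANS-PROB, where trans_prob[j][i] is the probability that state j
    is followed by state i."""
    n = len(current)
    prior = [0.0] * n
    for i in range(n):          # outer loop 141: state i at time T
        for j in range(n):      # inner loop 143: state j at time T-1
            prior[i] += current[j] * trans_prob[j][i]
    return prior


def interpret(data_values, trans_prob, data_state_prob, initial=None):
    """Return one posterior state probability vector per data value."""
    n = len(trans_prob)
    # Step 135: uniform initialization unless a distribution is supplied.
    current = list(initial) if initial is not None else [1.0 / n] * n
    history = []
    for value in data_values:                       # loop 137
        prior = prior_from_transitions(current, trans_prob)
        data_to_state = data_state_prob[value]      # step 181: look up
        combined = [p * d for p, d in zip(prior, data_to_state)]  # step 183
        total = sum(combined)
        if total > 0.0:
            current = [c / total for c in combined]  # normalization
        else:
            current = [1.0 / n] * n  # assumption: reset if all are zero
        history.append(current)
    return history


# Hypothetical two-state example; none of these numbers come from the figures.
TRANS_PROB = [[0.9, 0.1],
              [0.2, 0.8]]
DATA_STATE_PROB = {"a": [0.8, 0.2], "b": [0.1, 0.9]}
for vector in interpret(["a", "a", "b"], TRANS_PROB, DATA_STATE_PROB):
    print([round(p, 3) for p in vector])
```

Because the prior computation and the look up are independent of one another, they could be evaluated in either order, which matches the observation that steps 139 and 181 may run in parallel.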

In an embodiment of the present invention, having computed the Leaf State v. Rig State probability matrix, step 153, the interpretation/parse program loops over the sequence of data samples in the input data 405, loop 155. The interpretation/parse process may be performed for each sample in the data channel, time-step by time-step (this process is illustrated in FIG. 14).

At the beginning of each sample, there is a probability of being in each state from the previous sample (the initial condition being either that the rig is in the unknown state, or that the probability is equal for all states, step 154). In FIG. 14 these state probabilities are contained in a state probability vector 201. In an embodiment of the present invention, the transitional probability matrix 203 may be applied to all these state probabilities, step 157. This is a matrix multiplication operation. The application of the state transitional probabilities produces a prior state probability vector 205 (“prior” in the sense that it is computed solely from the previous state probability vector 201 and the transitional probability matrix 203).

The details of the Interpretation Program Code Generator 507 are now described. As noted above the Interpretation Program 401 contains the TRANS-PROB matrix 451 and DATA-STATE-PROB matrix 453. The Interpretation Program Code Generator 507 produces these two matrices from the grammar 505 as is illustrated in FIG. 16.

The following pseudo code describes the process of creating the TRANS-PROB matrix 451:

TABLE III Pseudo Code describing how to build the TRANS-PROB matrix

Build TRANS_PROB matrix
{
  Determine_Leaf_Nodes (START) {Determine Leaf Nodes of the grammar starting with START (see below)}
  Build Trans-Prob Matrix from leaf_nodes {rows = {START, leaf_nodes}; columns = {leaf_nodes, FINISH}}
  Traverse_to_collect_probabilities by following paths from each leaf-node to each other leaf-node
}

The first step is to determine the leaf nodes. As discussed herein above, the leaf nodes are those nodes that have no subactivity states. The grammar may thus merely be traversed until a node has no transition out. FIG. 17 is a very simple grammar used as an example (this grammar is used herein below to illustrate the operation of the system and method for interpreting drilling data using a stochastic grammar). Beginning at the START state, the grammar is traversed collecting the nodes that have no path other than to finish. Thus, there is a path A-B-D and D to Finish. Therefore one leaf node is the A-B-D node, representing the transitions from Start to Finish via the nodes A, B, and D. Similarly, further traversal of the grammar structure of FIG. 17 determines that nodes A-C and A-B-E are leaf nodes.

Next, the TRANS-PROB matrix is constructed to have a row and column for each leaf state, an additional row for the START state, and an additional column for the FINISH state. Such a TRANS-PROB matrix 231 that corresponds to the grammar of FIG. 17 is illustrated in FIG. 18.

Next, the TRANS-PROB matrix 231 is populated by traversing the grammar following the transitions from START to leaf-states and multiplying together the transition probabilities. In the example, the path from START to A-C to FINISH has the transitions Start→A with a probability 1.0, A→C with a probability 0.6, and C→FINISH with a probability 1.0. Thus, the START to A-C state-to-state transition probability is 1.0*0.6*1.0=0.6. Similarly, the path from START to A-B-D to FINISH has the transition probabilities 1.0, 0.4, 0.3, and 1.0 for a state-to-state transition probability of 0.12, and so on. Of note is the transition back from node A-B-E onto itself with a 0.5 probability. In the traversal of the grammar to determine the transitional probabilities from one node to another, if a transition causes a visit to a node that has previously been visited in the determination from that one node to that other node, the traversal stops and the product of the transitional probabilities encountered along the path is noted. In this particular example, there is only the transition from A-B-E onto itself with a transitional probability of 0.5. A complete leaf-state-to-leaf-state traversal that multiplies all the transitional probabilities in the path from each leaf-state that can reach each other leaf-state results in the TRANS-PROB matrix, e.g., for the grammar example of FIG. 17, in the matrix 231 of FIG. 18.
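The arithmetic of the two worked paths above can be checked with a short sketch. Only the multiplication along a path is shown; the function name `path_probability` is introduced here for illustration, and the full grammar traversal (including the rule for stopping at a previously visited node) is not reproduced.

```python
import math


def path_probability(transition_probabilities):
    """Multiply the transition probabilities met along one path."""
    return math.prod(transition_probabilities)


# START -> A (1.0), A -> C (0.6), C -> FINISH (1.0), as stated in the text.
start_to_a_c = path_probability([1.0, 0.6, 1.0])

# The four probabilities the text gives for the path via A-B-D.
start_to_a_b_d = path_probability([1.0, 0.4, 0.3, 1.0])

print(round(start_to_a_c, 2), round(start_to_a_b_d, 2))
```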

The process for building the interpretation program 401, e.g., the interpretation program code generator 507, also computes the DATA-STATE-PROB probability matrix 453. The following pseudo code describes one possible process of creating the DATA-STATE-PROB probability matrix 453:

TABLE IV Pseudo Code describing how to build the DATA-STATE-PROB matrix without taking confusion into account

Build Data-to-state matrix NOT taking confusion into account
1  {
2
3  for each state
4  {
5    determine the configurations compatible with that state (state-compatible-configurations) and number of state compatible configuration (state-compatible-configurations-count); /* i.e., if a state is defined by config 11--, the compatible configurations are 1100, 1101, 1110, and 1111. Thus there are 4 compatible configurations */
6    state-per-configuration-probability := 1 / state-compatible-configurations-count;
7    for each state-compatible-configuration
8    {
9      for each data-value
10     {
11       if the data value is compatible with the state-compatible-configuration then
12       {
13         note the data value as compatible (e.g., set a bit corresponding to that datavalue and configuration);
14         increment count of data value-configuration pairing as compatible for this state (data-value-compatible-count);
15       } /* end if data value is compatible */
16     } /* end for each data value */
17     for each compatible-datavalue
18     {
19       DATA-STATE-PROB [compatible-datavalue,state] := DATA-STATE-PROB [compatible-datavalue,state] + state-per-configuration-probability DIVIDED BY data-value-compatible-count;
20     } /* end for each compatible datavalue */
21   } /* end for each state-compatible-configuration */
22 } /* end for each state */
23 normalize each datavalue row;
25 }

The process iterates over the leaf-states defined in the grammar. In the present example, the leaf states are A, B, C, and D.

For each state, first there is a determination of which states are compatible with particular data values based on common traits, Loop Lines 3 through 22. FIGS. 19 and 20 illustrate the operation of matching compatible states and traits. Consider a very simple grammar 801 of FIG. 19. The grammar 801 has four leaf states: A, B, C, and D. The transitions defined by the grammar 801 are not specified. However, let's stipulate that the grammar defines four traits: TRAIT1, TRAIT2, TRAIT3, and TRAIT4. The values for the configurations of these traits for the four states are given in the TRAITS-TO-STATE table of FIG. 20. Note that similar configurations are given for the leaf-states defined in the grammar of Appendix A. For illustrative purposes, the traits in FIG. 19 are binary. Thus, for each trait, a configuration corresponding to a particular state may have the value undefined (-), 1, or 0. For State A, the configuration is --10, etc. Thus, since the undefined (-) values may take any value, the compatible configurations for state A are 0010, 0110, 1010, and 1110. FIG. 21 illustrates the compatible configurations for the states in the present example.
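Expanding a configuration that contains undefined (-) traits into its compatible configurations can be sketched as below. The function name `expand_configuration` and the string encoding of configurations are assumptions made for illustration; the inputs are the configurations quoted in the text.

```python
from itertools import product


def expand_configuration(config, values="01", wildcard="-"):
    """List every fully specified configuration compatible with `config`.

    Each undefined trait (the wildcard) may take any of `values`;
    every defined trait keeps its value.
    """
    choices = [values if trait == wildcard else trait for trait in config]
    return ["".join(combo) for combo in product(*choices)]


print(expand_configuration("--10"))  # state A
print(expand_configuration("11--"))  # the example in the Table IV comment
```

A configuration with k undefined binary traits expands to 2^k compatible configurations, so state A, with two undefined traits, has four.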

Similarly, configurations, i.e., combinations of trait values, are assigned to the various data values. For example, in the example of Appendix A, the token Run_In (Appendix A, Lines A-269 through A-304), corresponding to the RIG channel value 6, has the defined configuration classified=yes, absent=no, rotate=off, block=down, bottom=offbottom, pump=off, slips=notslips, and datagap=no. All other possible data values also have defined configurations.

In the simplified example presented here, there are six data values provided, 1 through 6. FIG. 22 provides a table of the configurations corresponding to these six data values. For example, data value 1 has the configuration 00--.

These configurations may also be expanded into compatible configurations like the configurations corresponding to the various leaf states. FIG. 23 is a table of the compatible configurations corresponding to the defined configurations of FIG. 22.

The compatible configurations are referred to in the pseudo code of Table IV as state-compatible-configurations and the count of such configurations as state-compatible-configurations-count.

Having determined the compatible configurations, the process assigns the total probability for the state over those compatible configurations by simply taking the inverse of the state-compatible-configurations-count, Line 6.

The process iterates over all the state-compatible-configurations for the state of the current outer loop iteration (loop starting at Line 7 and ending at Line 21) to determine the data values (innermost nested loop: Lines 9 through 16) that have a configuration that matches the compatible configurations. For any data value that is compatible with the state configuration (If statement, Line 11), the data value is noted as compatible (Line 13) and a count of compatible data value-to-state-configuration pairings is incremented (Line 14). FIG. 24 is an illustration of the notation of compatible configurations for each state. Note that the configurations for each state are numbered sequentially and that each element in the matrix of FIG. 24 is a sequence of flags that reflect whether the particular corresponding configuration of the state is compatible with the configuration of the data value. For example, because the configuration of state A is --10 and has compatible configurations 0010, 0110, 1010, 1110 and data value 1 has the configuration 00-- with the compatible configurations 0000, 0001, 0010, 0011, the first compatible configuration of state A is the only one that is compatible with the configurations of data value 1.

After the conclusion of the loop over compatible configurations, the process knows which data values are compatible with the state (e.g., have been noted as compatible) and how many such compatible pairings there are, data-value-compatible-count. That information is used to populate the DATA-STATE-PROB matrix 453. For each data value that is noted as compatible, the DATA-STATE-PROB [datavalue, state] matrix element is set to the number of compatible configurations for that data value and state combination divided by the total number of compatible configurations for the state, Lines 17-21. FIG. 25 is an illustration of the result of the loop of Lines 17-21. Consider, for example, the column for State B. For data value 2, there are 4 compatible configurations; for each of data values 3 through 6 there are 2 compatible configurations. Thus, for the entire column there are a total of 12 matches of compatible configurations. Dividing the matching configurations for each data value by the total number of matching configurations results in the values 1/3, 1/6, 1/6, 1/6, and 1/6 for the rows corresponding to data values 2 through 6.

Finally, the DATA-STATE-PROB matrix is normalized along the rows, Line 23. FIG. 26 is an illustration of the resulting matrix following the normalization operation.
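The construction described above (count the matching compatible configurations for each data value and state, divide by the column total, then normalize each row) can be sketched as follows. This follows the prose description of FIGS. 24 through 26 rather than reproducing Table IV line by line. The function names are introduced here; the configurations for state A and data value 1 are the ones quoted in the text, while state B and data value 2 are hypothetical stand-ins because the tables of FIGS. 20 and 22 are not reproduced.

```python
from itertools import product


def expand(config):
    """Set of fully specified binary configurations compatible with config."""
    choices = ["01" if trait == "-" else trait for trait in config]
    return {"".join(combo) for combo in product(*choices)}


def build_data_state_prob(state_configs, data_configs):
    """Return matrix[data_value][state] without modelling confusion."""
    # Number of compatible configurations shared by each data value and state.
    counts = {
        d: {s: len(expand(dc) & expand(sc)) for s, sc in state_configs.items()}
        for d, dc in data_configs.items()
    }
    matrix = {d: {} for d in data_configs}
    for s in state_configs:
        column_total = sum(counts[d][s] for d in data_configs)
        for d in data_configs:
            matrix[d][s] = counts[d][s] / column_total if column_total else 0.0
    for row in matrix.values():  # normalize each data value row
        row_total = sum(row.values())
        if row_total:
            for s in row:
                row[s] /= row_total
    return matrix


STATES = {"A": "--10", "B": "1---"}  # B is hypothetical
DATA = {1: "00--", 2: "1-1-"}        # data value 2 is hypothetical
for data_value, row in build_data_state_prob(STATES, DATA).items():
    print(data_value, {s: round(p, 3) for s, p in row.items()})
```

In this small example data value 1 can only correspond to state A, while data value 2 is shared between states A and B in proportion to the column-normalized match counts.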

The example grammar 801 of FIG. 19 does not provide specific transition rules. For the purpose of illustrating the operation of the data interpretation program, let's consider an example TRANS-PROB matrix with arbitrarily selected probability numbers. FIG. 27 provides such an example that will be used herein below for the purposes of illustrating the operation of the interpretation program 401.

FIG. 28 is an illustration of the results of the operation of the interpretation program 401 using the process described in conjunction with FIGS. 13 through 15, the TRANS-PROB matrix of FIG. 27 and the DATA-STATE-PROB matrix of FIG. 26. Row 803 represents a sequence of data values. Column 805 is an initial value for the CURRENT-STATE-PROBABILITY VECTOR; or it may be viewed as an interim vector in the processing of some larger sequence that is immediately followed by the vector 803.

Table 205′ contains the prior probabilities. Thus, the first column contains the prior probabilities obtained from a vector-to-matrix multiplication of the initial vector 805 and the TRANS-PROB matrix of FIG. 27 as discussed in conjunction with element 157 of FIG. 13 and step 139 of FIGS. 14 and 15. Table 207′ contains the Data-to-State vectors corresponding to the respective data values in the input data sequence 803. For example, the first, third, and sixth data values are the value 5. The value 5 has the data-to-state vector [0, 0.3, 0.1, 0.3, 0.1, 0]. Thus, that vector is recorded in columns 1, 3, and 6 of table 207′. Given the vectors 205 and 207 corresponding to a data value, these are element-by-element multiplied (operation 161) and normalized (operation 167) to produce the vector corresponding to the same data value in the output table 209′.

While the present example discussed herein above relies on a very simplified grammar, the same techniques may be used for a more complex grammar 505. Appendix A illustrates such a grammar. Appendix B is an example Java program implementation of an interpretation program code generator 507 operating on, for example, the activity grammar 505 that has been extracted into the representation of Appendix A.

It is entirely possible that a recorded data value is inaccurate. Consider an unrelated example of two drivers following one another. The trailing driver wishes to use the turn signal of the car in front to determine the actions of the first driver. Usually the turn signal coming on is a good predictor of the intent of the driver to turn. However, a missing turn signal may only mean that the light is out. Even a blinking turn signal may not indicate that the driver intends to turn. The blinking turn signal could be indicative of a faulty circuit or that the driver mistakenly engaged the turn signal (or that the person in the trailing car is hallucinating). Thus, there is some confusion about what the observed data really means.

The same phenomenon may occur in a drilling operation. For example, a RIG state indicative of the rig being in slips usually would mean that the rig is indeed in slips. However, it could also mean that there was an error in recording the rig as being in slips. Such errors may occur, for example, by sensors failing, sensor calibration being off, or some anomalous condition that caused a sensor to operate erratically.

An embodiment of the present invention accounts for such uncertainties, also known as confusion, by recording the confusion as to the meaning of a trait value in a confusion matrix mapping recorded values to actual values according to the probability that the recorded value accurately reflects the actual value. FIG. 29 illustrates some confusion matrices that could correspond to the example given herein above. The first trait confusion matrix 901, for example, corresponds to the Trait #1 and indicates that a recorded value of ON is 0.98 indicative of the actual value being ON and 0.02 indicative of the actual value being OFF. Similarly of the other traits. Note that the third and fourth confusion matrices 903 and 905 indicate that there is no confusion as to these values.

It is valuable to note that the confusion matrices are not necessarily symmetrical. The turn signal example would probably yield a similar asymmetry, i.e., it is more likely that the turn signal being on means an imminent turn than that the turn signal being off means that no turn will be made.
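A confusion matrix of this kind may be pictured as a small lookup table. The following Python sketch is illustrative only: the names `TRAIT_1_CONFUSION`, `actual_value_distribution` and `is_symmetric` are hypothetical, the ON row uses the 0.98/0.02 values given above for Trait #1, and the OFF row is an invented example chosen merely to show an asymmetric matrix.

```python
# Confusion matrix for one trait: outer key = recorded value,
# inner key = actual value, entry = P(actual | recorded).
# The ON row comes from the text; the OFF row is a made-up illustration.
TRAIT_1_CONFUSION = {
    "ON": {"ON": 0.98, "OFF": 0.02},
    "OFF": {"ON": 0.10, "OFF": 0.90},  # hypothetical values
}

# A trait with no confusion: every recorded value is taken at face value.
NO_CONFUSION = {
    "ON": {"ON": 1.0, "OFF": 0.0},
    "OFF": {"ON": 0.0, "OFF": 1.0},
}


def actual_value_distribution(confusion, recorded):
    """Return P(actual value | recorded value), checking that the row sums to 1."""
    row = confusion[recorded]
    total = sum(row.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"row for {recorded!r} sums to {total}, expected 1.0")
    return dict(row)


def is_symmetric(confusion):
    """True when P(actual=b | recorded=a) equals P(actual=a | recorded=b) for all a, b."""
    return all(
        abs(confusion[a][b] - confusion[b][a]) < 1e-9
        for a in confusion
        for b in confusion[a]
    )


print(actual_value_distribution(TRAIT_1_CONFUSION, "ON"))
print(is_symmetric(TRAIT_1_CONFUSION), is_symmetric(NO_CONFUSION))
```

Storing one row per recorded value keeps the lookup direct: the interpreter only ever sees recorded values and needs the distribution over what they might actually mean.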

The following pseudo code describes the process of creating the DATA-STATE-PROB matrix 453 using the Confusion Matrices:

TABLE V
Pseudo Code describing how to build the DATA-STATE-PROB matrix using confusion matrices

Build Data-to-state matrix taking confusion into account
1 {
2
3 for each state
4 {
5   determine the configurations compatible with that state (state-compatible-configurations) and number of state compatible configuration (state-compatible-configurations-count);
    /* i.e., if a state is defined by config 11--, the compatible configurations are 1100, 1101, 1110, and 1111. Thus there are 4 compatible configurations */
6   state-per-configuration-probability := 1 / state-compatible-configurations-count;
7   for each state-compatible-configuration
8   {
9     for each data-value in the list of data-values
10     {
11       if the data value is compatible with the state-compatible-configuration then
12       {
13         note the data value as compatible (e.g., set a bit corresponding to that datavalue and configuration);
14         increment count of data value-configuration pairing as compatible for this state (data-value-compatible-count);
15       } /* end if data value is compatible */
16     }; /* end for each data value */
17     if the state-compatible-configuration does NOT have confusion
18       for each compatible-datavalue
19         DATA-STATE-PROB [compatible-datavalue,state] := DATA-STATE-PROB [compatible-datavalue,state] + state-per-configuration-probability DIVIDED BY data-value-compatible-count
20     else /* the state-compatible-configuration does have confusion */
21       for each confusion-alternative-configuration /* e.g., if config is 1 0 and can be confused with 1 1 the iteration is over the set 1 0 and 1 1. */
22       {
21         Determine the alternative-config-probability for the confusion-alternative-configuration from the confusion matrix;
22         For each alternative-data-value IN the list of data-values that is compatible with the confusion-alternative-configuration
23           DATA-STATE-PROB [alternative-data-value,state] := DATA-STATE-PROB [alternative-data-value,state] + alternative-config-probability * state-per-configuration-probability DIVIDED BY data-value-compatible-count;
24       } /* end for each confusion-alternative-configuration */
25     /* ENDIF */
26   } /* end for each state-compatible-configuration */
27 } /* end for each state */
28 normalize each datavalue row;
29 }

The above pseudo code will be described herein by way of example. The pseudo code of Table V loops over each state (loop starting at Line 3) and operates much like the pseudo code of Table IV. For any configuration that is compatible with the state and for which there is no confusion, the assignment of probability is the same as in Table IV. However, if there is confusion in a compatible configuration, the probability associated with that configuration is allocated among the alternative configurations that could reflect the recorded configuration, and to the data values with which those alternative configurations are compatible, according to the probabilities assigned in the confusion matrices.

Consider a very simple example. If a first state S has a defined configuration of 1-, i.e., the first bit is 1 and the second bit is undefined, there are two configurations that are compatible with that definition, 1 0 and 1 1. Now, suppose that there are three alternative data values, A, B, and C. Let's define 1 0 to be compatible with A and B, and 1 1 to be compatible with B and C. Let's further define that the first compatible configuration, 1 0, has no confusion, whereas 1 1 may be confused and has alternative configurations 1 0 and 1 1. According to the confusion matrix, 1 1 has the probability 0.8 of being 1 1 and the probability 0.2 of being 1 0.

Because there are two compatible configurations for state S, each is allocated a probability of 0.5.

Consider now the first compatible configuration of the state S, 1 0. Because it has no confusion, the data values compatible with 1 0, namely A and B, are each allocated one half of the 0.5 probability allocated to 1 0.

After this first configuration has been processed, the DATA-TO-STATE probability matrix for state S is as follows:

A: 0.25

B: 0.25

C: 0.0

Now consider the second compatible configuration of state S, 1 1. Because 1 1 has confusion (Line 20 of the pseudo code of Table V), the probability of each alternative configuration (1 1 and 1 0) is determined from the confusion matrix. The confusion matrix has a row for each recorded value of a trait and a column for each actual value. In the present example, the only recorded value is 1 and the corresponding actual values may be either 1 (with a probability of 0.8) or 0 (with a probability of 0.2). Thus, the two alternative configurations are given the probabilities 0.8 and 0.2, respectively (Line 22). For each configuration that is an alternative to the confused configuration, each compatible data value is allocated the probability assigned to the alternative configuration multiplied by the portion of the probability assigned to the state-compatible configuration that is assigned to each data value (Line 23). Because there are two data values compatible with the second state-compatible configuration, each is allocated 0.25. This portion is then multiplied by the probability of each alternative configuration and allocated to the data values compatible with that alternative configuration, as follows:

A: 0.2*0.25 (from being compatible with 1 0 which is 0.2 probability alternative of 1 1)

B: 0.8*0.25+0.2*0.25 (from being compatible with 1 1, which is the 0.8 probability alternative of 1 1, and being compatible with 1 0, which is the 0.2 probability alternative of 1 1)

C: 0.8*0.25 (from being compatible with 1 1, which is the 0.8 probability alternative of 1 1)

Thus, the end-result allocation of data-to-state probabilities for state S is:

A: 0.25+0.2*0.25 = 0.30

B: 0.25+0.8*0.25+0.2*0.25 = 0.50

C: 0.0+0.8*0.25 = 0.20
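The allocation for state S can be checked with a short Python sketch. It is only an illustration under stated assumptions and is not the Java implementation of Appendix B: the names `expand`, `build_data_state_prob` and `normalize_rows` are hypothetical; confusion is supplied directly per configuration rather than derived from per-trait confusion matrices; and the per-data-value share is computed per configuration, as in the worked example.

```python
from itertools import product


def expand(pattern):
    """Expand a pattern such as '1-' into the concrete configurations ['10', '11']."""
    choices = [("0", "1") if ch == "-" else (ch,) for ch in pattern]
    return ["".join(bits) for bits in product(*choices)]


def build_data_state_prob(states, compatible, confusion):
    """
    states:     {state name: configuration pattern, '-' meaning undefined}
    compatible: {configuration: list of data values compatible with it}
    confusion:  {recorded configuration: {alternative configuration: probability}};
                configurations absent from this mapping have no confusion
    Returns {data value: {state: probability}} before row normalization.
    """
    data_values = sorted({d for values in compatible.values() for d in values})
    prob = {d: {s: 0.0 for s in states} for d in data_values}
    for state, pattern in states.items():
        configs = expand(pattern)
        per_config = 1.0 / len(configs)
        for config in configs:
            values = compatible.get(config, [])
            if not values:
                continue  # assumption: a configuration with no data values is skipped
            share = per_config / len(values)
            # No confusion is treated as a single alternative with probability 1.
            alternatives = confusion.get(config, {config: 1.0})
            for alt, p_alt in alternatives.items():
                for d in compatible.get(alt, []):
                    prob[d][state] += p_alt * share
    return prob


def normalize_rows(prob):
    """Scale each data-value row so that its entries over all states sum to 1."""
    out = {}
    for d, row in prob.items():
        total = sum(row.values())
        out[d] = {s: (p / total if total else 0.0) for s, p in row.items()}
    return out


# Inputs of the worked example for state S.
table = build_data_state_prob(
    states={"S": "1-"},
    compatible={"10": ["A", "B"], "11": ["B", "C"]},
    confusion={"11": {"11": 0.8, "10": 0.2}},
)
for d in sorted(table):
    print(d, round(table[d]["S"], 2))
```

With these inputs the sketch gives approximately 0.30, 0.50 and 0.20 for A, B and C, before the row normalization of Line 28. Because the example has only one state, `normalize_rows` would make every row equal to 1; normalization becomes informative once several states compete for the same data value.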

The methodology for storing a stochastic grammar 505 in an ontology for drilling 501 and using that in the manner described for interpreting a data stream 405 may be extended. In an embodiment, the above-described methodology is used to assess the compatibility of a data set with a particular grammar and thereby to determine something about the data set. For example, each operator company may have its own way of performing drilling operations and may handle particular situations in particular ways. Each company would then have a unique grammar. Similarly, different geographic regions may have different grammars. A data set for which an analyst does not know the origin, be it by operator company or by geographic region, may be interpreted against several alternative grammars to determine which grammar is the best fit and therefore most likely to be the origin of the data set. FIG. 30 is a flow-chart illustrating the process of determining the origin of a data set.

A data set 251, e.g., a RIG state channel or another data channel, is received as input. A plurality of hypotheses 255a through 255d are started, step 253. Each hypothesis 255 may be a data interpretation program 401 that implements a unique stochastic grammar reflecting the operations of a particular drilling operator or geological area. These hypothesis data interpretation programs 255 each iterate 257 over the data sequence 251 in the manner described herein above in conjunction with, for example, the interpretation program 401. On each iteration, the hypothesis interpretation programs 255 determine a state probability vector corresponding to an interpretation of the data set using the grammar associated with that particular hypothesis.
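One iteration of such a hypothesis program may be sketched in Python as follows. This is a minimal illustration, assuming that each iteration first applies the transition probabilities to the previous state probability vector, then weights the result by the data-to-state probabilities for the observed value and normalizes. The function name `update_state_probabilities`, the two-state layout and all numeric values are hypothetical.

```python
def update_state_probabilities(previous, transitions, data_state_prob, sample):
    """
    previous:        list of state probabilities at sample n-1
    transitions:     transitions[i][j] = P(state j at n | state i at n-1)
    data_state_prob: {data value: list of per-state probabilities}
    sample:          the data value observed at sample n
    Returns the normalized state probability vector at sample n.
    """
    n = len(previous)
    # Prediction: propagate the previous vector through the transition matrix.
    predicted = [
        sum(previous[i] * transitions[i][j] for i in range(n)) for j in range(n)
    ]
    # Update: weight each state by its compatibility with the observed value.
    weights = data_state_prob[sample]
    unnormalized = [p * w for p, w in zip(predicted, weights)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("sample is incompatible with every state")
    return [u / total for u in unnormalized]


# Hypothetical two-state example: index 0 = in slips, index 1 = drilling.
transitions = [[0.9, 0.1], [0.2, 0.8]]
data_state_prob = {"IN_SLIPS": [0.95, 0.05], "NOT_IN_SLIPS": [0.10, 0.90]}
vector = update_state_probabilities([0.5, 0.5], transitions, data_state_prob, "IN_SLIPS")
print([round(p, 3) for p in vector])
```

Feeding the returned vector back in as `previous` for the next sample yields the iteration over the data sequence; each hypothesis would carry its own `transitions` and `data_state_prob` derived from its grammar.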

Each hypothesis may test the state probability vector it generates against some criteria to determine whether the hypothesis is plausible, decision 261. Usually, if a data set reflects activities that may be interpreted by a particular grammar, the state probability vector would strongly indicate that certain activities are much more probable than the other activities. Conversely, if all activities are roughly equally probable, there is a very poor match between the grammar and the data set. Thus, if the grammar seems ill-suited over several iterations, the hypothesis is aborted, step 263; otherwise, the next point in the sequence is processed, step 265. At the conclusion of the processing of the data set through the various hypotheses, the interpretation results may be reported, step 267, including reporting the best overall match between the data set 251 and the grammars processed by the various hypothesis interpretation programs 255.
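The plausibility test and abort logic may be sketched in Python as follows. The criteria are assumptions made for illustration, since the description leaves them open: an iteration counts as a poor match when the largest state probability falls below `min_peak`, a hypothesis is aborted after `patience` consecutive poor iterations, and surviving hypotheses are ranked by their mean peak probability. The names `run_hypotheses`, `sharp_grammar` and `flat_grammar` are hypothetical stand-ins for the hypothesis interpretation programs 255.

```python
def run_hypotheses(data, hypotheses, min_peak=0.6, patience=3):
    """
    data:       list of data values (e.g., a RIG state channel)
    hypotheses: {name: callable taking one data value and returning a state
                 probability vector under that hypothesis's grammar}
    Returns (best surviving hypothesis name or None, {name: score or None}).
    """
    poor_runs = {name: 0 for name in hypotheses}
    peak_sums = {name: 0.0 for name in hypotheses}
    active = set(hypotheses)
    for sample in data:
        for name in sorted(active):  # sorted() copies, so discarding below is safe
            peak = max(hypotheses[name](sample))
            peak_sums[name] += peak
            poor_runs[name] = poor_runs[name] + 1 if peak < min_peak else 0
            if poor_runs[name] >= patience:
                active.discard(name)  # corresponds to aborting, step 263
    scores = {
        name: (peak_sums[name] / len(data) if name in active and data else None)
        for name in hypotheses
    }
    best = max(active, key=lambda name: scores[name]) if active and data else None
    return best, scores


def sharp_grammar(sample):
    """Toy hypothesis whose grammar explains the data well."""
    return [0.9, 0.1] if sample == "IN_SLIPS" else [0.15, 0.85]


def flat_grammar(sample):
    """Toy hypothesis for which all activities look equally probable."""
    return [0.5, 0.5]


best, scores = run_hypotheses(
    ["IN_SLIPS", "NOT_IN_SLIPS", "NOT_IN_SLIPS", "IN_SLIPS"],
    {"operator_a": sharp_grammar, "operator_b": flat_grammar},
)
print(best, scores)
```

In this toy run the flat hypothesis is aborted after three poor iterations and the sharp one is reported as the best match. Other flatness measures, such as the entropy of the state probability vector, could serve equally well as the decision criterion.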

The particular embodiments disclosed above are illustrative only, as the invention may be modified and practiced in different but equivalent manners apparent to those skilled in the art having the benefit of the teachings herein. Furthermore, no limitations are intended to the details of construction or design herein shown, other than as described in the claims below. It is therefore evident that the particular embodiments disclosed above may be altered or modified and all such variations are considered within the scope and spirit of the invention. In particular, every range of values (of the form, “from about A to about B,” or, equivalently, “from approximately A to B,” or, equivalently, “from approximately A-B”) disclosed herein is to be understood as referring to the power set (the set of all subsets) of the respective range of values. Accordingly, the protection sought herein is as set forth in the claims below. 

1. A method of processing oilfield process status data in a data channel, comprising: representing oilfield process knowledge in a knowledge representation, the knowledge representation including representations of uncertainty values in the oilfield process knowledge; and generating an interpretation of the oilfield process status data from the oilfield process knowledge representation.
 2. The method of claim 1, wherein: the step of representing oilfield process knowledge in a knowledge representation comprises representing oilfield process knowledge in a knowledge representation of activities and transitions between subactivities of activities, including representing probabilities for transitions between subactivities of activities; and the step of generating an interpretation comprises computing activity probability values corresponding to each of a set of activities from transitional probabilities and data values in the data channel.
 3. The method of claim 2, wherein the step of computing activity probability values comprises: determining detailed activities corresponding to each possible subactivity; computing activity-to-data probability values for each ordered pair (a,d) of detailed activity (a) and data channel value (d), wherein for the given activity (a), the data channel would indicate the data channel value (d); for each data sample (n) in the data channel computing an activity probability vector (n) for a set of activities, wherein the probability value for each activity in the set of activities indicates the probability that the drilling rig carried out the each activity at time n, wherein each of the activity probability vectors (205) is computed by applying the transition probabilities (203) to an activity vector (n−1) (201); selecting an activity-to-data probability vector (207) from the computed activity-to-data probability values, wherein the elements of the activity-to-data probability vector correspond to the activities in the set of activities and the data sample (n); and computing an output probability vector (209) by applying the activity-to-data probability vector (207) to the first probability vector (205) and by applying Bayes Theorem to the result.
 4. A method of interpreting oilfield process state data, comprising: representing oilfield process knowledge in an activity grammar containing activity states; transitional probabilities for transitioning from one activity to another; configuration variables; and leaf activity with assigned values for each configuration variable; and interpreting input data using the activity grammar to compute probabilities for each of a set of activities defined as possible activities of an oilfield process state.
 5. A method of processing drilling rig data in a data channel, comprising: representing drilling knowledge in a knowledge representation including representing uncertainty values in the drilling knowledge; and generating an interpretation of the drilling rig data from the drilling knowledge representation including the representation of uncertainty values.
 6. The method of processing drilling rig data in the data channel of claim 5, wherein: the representing comprises: representing drilling knowledge in a knowledge representation of activities and transitions between subactivities of activities, including representing probabilities for transitions between subactivities of activities; and the generating an interpretation comprises: computing activity probability values corresponding to each of a set of activities from the transitional probabilities and data values in the data channel.
 7. The method of interpreting drilling rig data in a data channel of claim 6, wherein the step of computing activity probability values comprises: determining detailed activities corresponding to each possible subactivity; computing activity-to-data probability values for each ordered pair (a,d) of detailed activity (a) and data channel value (d) that given the activity (a), the data channel would indicate the data channel value (d); for each data sample (n) in the data channel computing an activity probability vector (n) for a set of activities, wherein the probability value for each activity in the set indicates the probability that the drilling rig carried out the each activity at time n, by: computing a first probability vector (205) by applying the transition probabilities (203) to the activity vector (n−1) (201); selecting an activity-to-data probability vector (207) from the computed activity-to-data probability values, wherein the elements of the activity-to-data probability vector correspond to the activities in the set of activities and the data sample (n); and computing an output probability vector (209) by applying the activity-to-data probability vector (207) to the first probability vector (205) and by applying Bayes Theorem to the result.
 8. A method of interpreting drilling rig data, comprising: representing drilling knowledge in an activity grammar containing: activity states; transitional probabilities for transitioning from one activity to another; configuration variables; leaf activity with assigned values for each configuration variable; and interpreting input data using the activity grammar to compute probabilities for each of a set of activities defined as possible activities of a drilling rig.
 9. The method of interpreting drilling rig data of claim 8, wherein the representing drilling knowledge in an activity grammar further comprises: for each activity that is not a leaf activity: a start state and a finish state; at least one subactivity; at least one transition from the start state to at least one subactivity; and at least one transition from at least one subactivity to the finish state.
 10. A computer system for interpreting drilling data comprising: sensors for collecting drilling data; a storage for storing data and program instructions; a processor having a data input/output mechanism and operable to input data on the input/output mechanism and to output data onto the input/output mechanism, and to process data according to instructions stored in the program storage; wherein the storage contains: a representation of drilling knowledge including representation of uncertainty values in the drilling knowledge; and instructions that when executed cause the processor to generate an interpretation of the drilling rig data from the drilling knowledge representation including the representation of uncertainty values.
 11. The computer system for interpreting drilling data of claim 10, wherein: the representation of drilling knowledge comprises a knowledge representation of activities and transitions between subactivities of activities, including representing probabilities for transitions between subactivities of activities; and the instructions to generate an interpretation comprises instructions to cause the processor to: compute activity probability values corresponding to each of a set of activities from the transitional probabilities and data values in the data channel.
 12. The computer system for interpreting drilling data according to claim 11, wherein the instructions to cause the processor to compute activity probability values comprise instructions to cause the processor to: determine detailed activities corresponding to each possible subactivity; compute activity-to-data probability values for each ordered pair (a,d) of detailed activity (a) and data channel value (d) that given the activity (a), the data channel would indicate the data channel value (d); for each data sample (n) in the data channel compute an activity probability vector (n) for a set of activities, wherein the probability value for each activity in the set indicates the probability that the drilling rig carried out the each activity at time n, by: computing a first probability vector (205) by applying the transition probabilities (203) to the activity vector (n−1) (201); selecting an activity-to-data probability vector (207) from the computed activity-to-data probability values, wherein the elements of the activity-to-data probability vector correspond to the activities in the set of activities and the data sample (n); computing an output probability vector (209) by applying the activity-to-data probability vector (207) to the first probability vector (205) and normalizing the result.
 13. A method for determining probabilities of data collected during exploration or production of subterranean resources corresponding to particular activities, comprising: receiving a sequence of data values from a subterranean resource exploration operation; and determining the probability that the data values correspond to each of several activity states using a function derived from a knowledge representation containing probabilities of transitioning from one activity state to each other of the several activity states, and a function providing a probability mapping of particular data values to possible activity states.
 14. The method for determining probabilities of data collected during exploration or production of subterranean resources corresponding to particular activities of claim 13, wherein the function derived from a knowledge representation containing probabilities of transitioning from one activity state to each other of the several activity states is a transition probability matrix in which each element is the probability of transitioning from one activity state to another activity state.
 15. The method for determining probabilities of data collected during exploration or production of subterranean resources corresponding to particular activities of claim 14 further comprising: adjusting the function providing a probability mapping of particular data values to possible activity states to account for confusion in regard to whether particular data values correspond to actual conditions.
 16. The method for determining probabilities of data collected during exploration or production of subterranean resources corresponding to particular activities of claim 13 wherein the knowledge representation containing probabilities of transitioning from one activity state to each other of the several activity states is a representation of a stochastic grammar having rules describing possible transitions in a sequence of activities that may correspond to the sequence of oilfield data values.
 17. The method for determining probabilities of data collected during exploration or production of subterranean resources corresponding to particular activities of claim 13, wherein the function providing a probability mapping of particular data values to possible activity states is a data-to-activity state probability matrix in which each element is the probability that a given data value corresponds to a particular activity state.
 18. The method for determining probabilities of data collected during exploration or production of subterranean resources corresponding to particular activities of claim 13 further comprising: determining a first vector of probability values corresponding to a particular data item in the sequence of oilfield data values by applying the function derived from a knowledge representation containing probabilities of transitioning from one activity state to each other of the several activity states to a preceding vector of probability values corresponding to a data item preceding the particular data item to determine the probability of each of the several activity states given the probability in the preceding vector of probability values and the probabilities of transitioning from one activity state to each other of the several activity states; determining a second vector of probability values corresponding to a particular data item in the sequence of oilfield data values by applying the function providing a probability mapping of particular data values to possible activity states to the particular data item in the sequence of oilfield data values; and determining a probability vector for the particular data item by combining the first vector of probability values and the second vector of probability values.
 19. A method of evaluating alternative hypotheses in regard to origin of data collected during exploration or production of subterranean resources, comprising: receiving a sequence of data values from a subterranean resource exploration operation; for each of a plurality of hypotheses, determining the probability that the data values correspond to each of several activity states using: a first function providing probabilities for transitioning from each of the several activity states to each other of the several activity states wherein the first function is derived from a knowledge representation containing probabilities of transitioning from one activity state to each other of the several activity states according to rules specified for the each of the plurality of hypotheses, and a second function providing a probability mapping of particular data values to possible activity states wherein the second function is derived from a knowledge representation containing hypothesis specific mapping of traits to activity states and traits to data values according to rules specified for the each of the plurality of hypotheses; and rejecting any hypothesis in which the determined probability that the data values correspond to each of several activity states indicates that the rules specified for the hypothesis provide a poor match of the data sequence.
 20. The method of evaluating alternative hypotheses in regard to origin of data collected during exploration or production of subterranean resources according to claim 19, further comprising: storing a plurality of stochastic grammars in a knowledge representation for exploration of subterranean resources, wherein each stochastic grammar reflects data-origin-particular activities encountered in the exploration of subterranean resources and the probabilities of transitioning between those activities; and generating the first function from the stochastic grammar.